LAF-Fabric: a data analysis tool for Linguistic Annotation Framework with an application to the Hebrew Bible
Authors
Abstract
The Linguistic Annotation Framework (LAF) provides a general, extensible stand-off markup system for corpora. This paper discusses LAF-Fabric, a new tool for analysing LAF resources in general, with an extension for processing the Hebrew Bible in particular. We first walk through the history of the Hebrew Bible as a text database in decade-wide steps. Then we describe how LAF-Fabric may serve as an analysis tool for this corpus. Finally, we describe three analytic projects/workflows that benefit from the new LAF representation: 1) the study of linguistic variation: extracting co-occurrence data of common nouns between the books of the Bible (Martijn Naaijer); 2) the study of the grammar of Hebrew poetry in the Psalms: extracting clause typology (Gino Kalkman); 3) construction of a parser of classical Hebrew by Data Oriented Parsing: generating tree structures from the database (Andreas van Cranenburgh).
Similar resources
The Hebrew Bible as Data: Laboratory - Sharing - Experiences
The systematic study of ancient texts including their production, transmission and interpretation is greatly aided by the digital methods that started taking off in the 1970s. But how is that research in turn transmitted to new generations of researchers? We tell a story of Bible and computer across the decades and then point out the current challenges: (1) finding a stable data representation ...
A standardized general framework for encoding and exchange of corpus annotations: The Linguistic Annotation Framework, LAF
The Linguistic Annotation Framework, LAF, proposes a generic data model for exchange of linguistic annotations and has recently become an ISO standard (ISO 24612:2012). This paper describes some aspects of LAF, its XML-serialization GrAF and some use-cases related to the framework. While GrAF has already been used as exchange format for corpora with several annotation layers, such as MASC and O...
An RDF Realisation of LAF in the DADA Annotation Server
The Linguistic Annotation Framework defines a generalised graph-based model for annotation data, intended as an interchange format for transfer of annotations between tools. The DADA system uses an RDF-based representation of annotation data and provides a web-based annotation store. The annotation model in DADA can be seen as an RDF realisation of the LAF model. This paper describes the relatio...
Off-Road LAF: Encoding and Processing Annotations in NLP Workflows
The Linguistic Annotation Framework (LAF) provides an abstract data model for specifying interchange representations to ensure interoperability among different annotation formats. This paper describes an ongoing effort to adapt the LAF data model as the interchange representation in complex workflows as used in the Language Analysis Portal (LAP), an on-line and large-scale processing service th...
A model oriented approach to the mapping of annotation formats using standards
In this paper, we present Salt, a framework for mapping heterogeneous linguistic annotation formats into each other using a model-based approach, i.e. independently of the actual formats in which the corresponding linguistic data is being expressed. As we describe the underlying concept of this framework, we identify how it echoes ongoing standardisation activities within ISO committee TC 37/S...
Journal title:
- CoRR
Volume abs/1410.0286 Issue
Pages -
Publication date 2014